Memory Networks (Facebook AI)

Memory Networks (2014)

End-To-End Memory Networks

  • Source Code
  • Differentiable version of Memory Networks
  • Two grand challenges in artificial intelligence research
    • Multiple computational steps in the service of answering a question or completing a task
    • Long term dependencies in sequential data
  • Because the function from input to output is smooth, we can easily compute gradients and back-propagate through it (see the attention sketch below).
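A minimal sketch of one memory "hop" in an End-To-End Memory Network, with toy dimensions and random vectors standing in for the learned embedding matrices A and C; the point is that the read is a softmax-weighted sum, so the whole path from question to answer is differentiable.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n_mem, d = 5, 8                  # number of memory slots, embedding size
m = rng.normal(size=(n_mem, d))  # input memory embeddings (stand-in for A·x_i)
c = rng.normal(size=(n_mem, d))  # output memory embeddings (stand-in for C·x_i)
u = rng.normal(size=d)           # question embedding

p = softmax(m @ u)   # soft attention over memories (smooth, differentiable)
o = p @ c            # weighted sum of output embeddings
a = o + u            # response vector passed to the final answer layer
print(p, a)
```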

NTM (DeepMind)

Neural Turing Machine (2014)

  • Human Cognition VS Computing
    • Rule-based manipulation VS Simple Program
    • Short-term storage of information VS Program arguments
    • -> Working Memory VS "NTM"

  • !!! Bits #7 and #8 of the input are delimiter bits (see the sketch after this list)
    • Bit #7 on means "input start"
    • Bit #8 on means "input ends here, start emitting the result" (analogous to the end-of-input symbol in seq2seq)
    • Zero vectors are fed as input while the outputs are being generated
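A small sketch of how a copy-task example with these delimiter bits could be built, assuming an 8-dimensional input where bits #1-#6 carry data and bits #7/#8 are the flags described above (the exact layout is an assumption for illustration):

```python
import numpy as np

def make_copy_example(seq_len=3, data_bits=6, seed=0):
    """Bit #7 marks 'input start', bit #8 marks 'input end / start result',
    and zero vectors are fed while the target is being produced."""
    rng = np.random.default_rng(seed)
    data = rng.integers(0, 2, size=(seq_len, data_bits))
    width = data_bits + 2                # data bits + two delimiter bits
    total = 1 + seq_len + 1 + seq_len    # start flag, data, end flag, output phase
    x = np.zeros((total, width))
    x[0, data_bits] = 1                  # bit #7 on: input start
    x[1:1 + seq_len, :data_bits] = data  # the payload to copy
    x[1 + seq_len, data_bits + 1] = 1    # bit #8 on: input end, start result
    y = np.zeros((total, data_bits))
    y[-seq_len:] = data                  # target: reproduce the payload
    return x, y

x, y = make_copy_example()
print(x)
print(y)
```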

Reinforcement Learning Neural Turing Machines (2015)

Hybrid computing using a neural network with dynamic external memory (DNC) (2016)

  • Unofficial Source code
  • NTM vs DNC
    • Same overall goal: a neural network controller coupled to an external memory
    • DNC changes how addressing is implemented and adds a memory allocation mechanism (see the allocation sketch below)
    • DNC gives better accuracy (compared on the bAbI tasks)
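A minimal sketch of the allocation weighting that DNC introduces on top of NTM-style content addressing; the usage vector here is hand-made, whereas in the real model it is maintained from past reads and writes:

```python
import numpy as np

def allocation_weights(usage):
    """Sort slots by usage (least-used first) and give each slot weight
    (1 - u[j]) times the product of the usages of all less-used slots."""
    order = np.argsort(usage)
    a = np.zeros_like(usage)
    prod = 1.0
    for j in order:
        a[j] = (1.0 - usage[j]) * prod
        prod *= usage[j]
    return a

u = np.array([0.9, 0.1, 0.5, 0.0])
print(allocation_weights(u))   # the completely unused slot gets all the weight here
```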

Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

  • SAM (Sparse Access Model)
    • Upgrades NTM with approximate nearest neighbours for a large external memory (see the sparse read sketch below)
      • "Scaling" = large memory size
      • Approximate Nearest Neighbor (ANN) lookup in $\mathcal{O}(\log N)$ instead of linear search in $\mathcal{O}(N)$
      • K-Nearest Neighbor (KNN)
        • K = number of memory slots actually read/written (the sparsity level)
    • c.f. the Sparse Differentiable Neural Computer (SDNC) is the corresponding upgraded version of DNC
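A minimal sketch of a sparse read in the spirit of SAM: only the K memory slots closest to the query get nonzero weight. Exact KNN via argpartition stands in here for the approximate nearest-neighbour index that gives the $\mathcal{O}(\log N)$ lookup:

```python
import numpy as np

def sparse_read(memory, query, k=4):
    # Cosine similarity to every slot (exact search; a stand-in for the ANN index).
    sims = memory @ query / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(query) + 1e-8)
    topk = np.argpartition(-sims, k)[:k]   # indices of the K most similar slots
    w = np.zeros(len(memory))
    e = np.exp(sims[topk] - sims[topk].max())
    w[topk] = e / e.sum()                  # sparse softmax: only K nonzero weights
    return w @ memory, w

rng = np.random.default_rng(0)
M = rng.normal(size=(1024, 16))            # large external memory
q = rng.normal(size=16)
read_vec, weights = sparse_read(M, q, k=4)
print(np.count_nonzero(weights))           # -> 4
```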

Pointer Networks (2015)

Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (DMN, Dynamic Memory Networks) (2015)

"Hierarchical Memory Networks" (ICLR 2017)

Reduces the number of memory cells the softmax has to attend over by organizing memory hierarchically (a small sketch follows).
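A toy sketch of the idea, assuming a two-level hierarchy in which a group is first selected by its summary vector and the softmax is then computed only inside that group (group summaries as plain means are an assumption for illustration):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n_groups, group_size, d = 8, 16, 12
memory = rng.normal(size=(n_groups, group_size, d))  # 8 * 16 = 128 slots in total
summaries = memory.mean(axis=1)                      # one summary vector per group
q = rng.normal(size=d)                               # query / question embedding

g = int(np.argmax(summaries @ q))    # cheap top-level selection of one group
p = softmax(memory[g] @ q)           # softmax over 16 slots instead of 128
read = p @ memory[g]
print(g, read.shape)
```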

"Dynamic NTM with Continuous and Discrete Addressing Schemes" ICLR 2017

Combines REINFORCE-based hard attention with softmax-based soft attention; I do not fully understand how the two are combined (a rough sketch of the two addressing modes follows).
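A rough sketch contrasting the two addressing schemes named above: continuous addressing is a softmax over slot scores, while discrete addressing samples a single slot from that distribution, which is the step that needs REINFORCE-style gradients. How the paper mixes the two is not reproduced here:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
scores = rng.normal(size=6)       # similarity of the query to 6 memory slots
memory = rng.normal(size=(6, 4))

p = softmax(scores)
soft_read = p @ memory            # continuous addressing: differentiable blend
idx = rng.choice(len(p), p=p)     # discrete addressing: sample one slot
hard_read = memory[idx]           # gradient through the sample needs REINFORCE
print(soft_read, hard_read)
```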

"Lie Access Neural Turing Machine" ICLR 2017

Addressing uses Lie group actions, which let the head move over memory in a natural, continuous way (sketch below).
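A toy sketch of Lie-access style addressing, assuming memory slots placed at points in $\mathbb{R}^2$, a head moved by a translation (the simplest group action), and read weights that fall off with distance; the specific kernel and layout are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
keys = rng.normal(size=(6, 2))    # each slot's position in the plane
values = rng.normal(size=(6, 4))  # each slot's stored vector
head = np.zeros(2)                # current head position

shift = np.array([0.5, -0.2])     # group element emitted by the controller
head = head + shift               # smooth, invertible movement of the head

d2 = ((keys - head) ** 2).sum(axis=1)  # squared distance to every slot
w = 1.0 / (d2 + 1e-6)
w /= w.sum()                           # inverse-square-distance read weights
read = w @ values
print(read)
```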

